Using Supertags and Encoded Annotation Principles for Improved Dependency to Phrase Structure Conversion

Authors

  • Seth Kulick
  • Ann Bies
  • Justin Mott
Abstract

We investigate the problem of automatically converting from a dependency representation to a phrase structure representation, a key aspect of understanding the relationship between these two representations for NLP work. We implement a new approach to this problem, based on a small number of supertags, along with an encoding of some of the underlying principles of the Penn Treebank guidelines. The resulting system significantly outperforms previous work in such automatic conversion. We also achieve comparable results to a system using a phrase-structure parser for the conversion. A comparison with our system using either the part-of-speech tags or the supertags provides some indication of what the parser is contributing.

Related resources

Building a Persian Phrase Structure Treebank by Automatic Conversion

Treebanks are an important and useful resource in natural language processing tasks. Dependency and phrase structure treebanks are the two best-known kinds. Many efforts have already been made to convert dependency structure to phrase structure. In this paper we study an approach to converting dependency structure to phrase structure, motivated by the lack of a large phrase structure treebank for Persian. A...

A Rule-Based Approach for Automatically Converting Dependency Parse Trees to Phrase Structure Parse Trees for Persian

In this paper, an automatic method for converting a dependency parse tree into an equivalent phrase structure tree is introduced for the Persian language. In the first step, a rule-based algorithm was designed. Then, Persian-specific dependency-to-phrase-structure conversion rules were merged into the algorithm. Subsequently, the Persian dependency treebank with about 30,000 sentences was used as an input ...

Talbanken05: A Swedish Treebank with Phrase Structure and Dependency Annotation

We introduce Talbanken05, a Swedish treebank based on a syntactically annotated corpus from the 1970s, Talbanken76, converted to modern formats. The treebank is available in three different formats, besides the original one: two versions of phrase structure annotation and one dependency-based annotation, all of which are encoded in XML. In this paper, we describe the conversion process and exem...

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...

Tapping the implicit information for the PS to DS conversion of the Chinese Treebank

We examine the linguistic adequacy of dependency structure annotation automatically converted from phrase structure treebanks with the head table approach and show this method is far from satisfactory. We propose an alternative approach that better exploits the implicit information in the phrase structure and show these two approaches only agree 60.6% of the time when evaluated against the Chin...

Publication date: 2012